Abstract

In this paper we discuss testing for an interaction in the two-way ANOVA with
just one observation per cell. The known results are reviewed and a simulation
study is performed to evaluate type I and type II risks of the tests. It is
shown that the Tukey and Mandel additivity tests have very low power in the
case of a more general interaction scheme. A modification of Tukey's test is
developed to resolve this issue. All tests mentioned in the paper have been
implemented in the R package AdditivityTests.

Key words: 

two-way ANOVA, additivity tests, Tukey additivity test 






1 Introduction 




In many applications of statistical methods, it is assumed that the response
variable is a sum of several factors and random noise. In the real world this
may not be an appropriate model. For example, some patients may react
differently to the same drug treatment, or the effect of a fertilizer may
depend on the type of soil. There might exist an interaction between factors.
Testing for such an interaction will be referred to here as testing of the
additivity hypothesis.

If there is more than one observation per cell then standard ANOVA techniques
may be applied. Unfortunately, in many cases it is infeasible to get more than
one observation taken under the same conditions. For instance, it makes little
sense to ask the same student the same question twice.

We restrict ourselves to the case of two factors, i.e. the two-way model, in
which the response in the $i$th row and $j$th column is modeled as

$$y_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ij}, \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \tag{1}$$

where

$$\sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0$$

and the $\epsilon_{ij}$ are normally distributed independent random variables
with zero mean and variance $\sigma^2$.



To test the additivity hypothesis

$$H_0: \gamma_{ij} = 0, \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \tag{2}$$

a number of tests have been developed. Section 2 recalls the known additivity
tests; see also Alin and Kurt (2006) and Boik (1993a).



In Section 3 the power of the tests described in Section 2 is compared by
means of simulation. While the Tukey test has relatively good power when the
interaction is a product of the main effects, i.e. when
$\gamma_{ij} = k\alpha_i\beta_j$ ($k$ is a real constant), its power for more
general interactions is very poor.



It should be recalled that Tukey (1949) did not originally propose his test
for any particular type of interaction. In fact, after a small modification
derived in Section 4, the power of the test improves dramatically. Some issues
arise when the sample size is not large enough; these may be resolved by a
permutation test or bootstrap.



2 Additivity Tests 



This section recalls the known additivity tests of hypothesis (2) in model (1).
Let $\bar{y}_{..}$ denote the overall mean, $\bar{y}_{i.}$ the $i$th row mean
and $\bar{y}_{.j}$ the $j$th column mean. The matrix $R = [r_{ij}]$ will stand
for the residual matrix with respect to the main effects,

$$r_{ij} = y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}.$$

The decreasingly ordered eigenvalues of $RR^T$ will be denoted by
$\kappa_1 > \kappa_2 > \ldots > \kappa_{\min(a,b)-1}$, and their scaled
versions equal

$$\omega_i = \frac{\kappa_i}{\sum_k \kappa_k}, \qquad i = 1, 2, \ldots, \min(a,b) - 1.$$

If an interaction is present, we may expect some of the $\omega_i$ coefficients
to be substantially higher than others.



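Tukey test: Introduced in Tukey (1949). The Tukey test first estimates row and
column effects and then tests for an interaction of the type
$\gamma_{ij} = k\alpha_i\beta_j$ ($k = 0$ implies no interaction). The Tukey
test statistic $S_T$ equals

$$S_T = \frac{MS_{int}}{MS_{error}},$$

where

$$MS_{int} = \frac{\left(\sum_i \sum_j y_{ij} (\bar{y}_{i.} - \bar{y}_{..})(\bar{y}_{.j} - \bar{y}_{..})\right)^2}{\sum_i (\bar{y}_{i.} - \bar{y}_{..})^2 \sum_j (\bar{y}_{.j} - \bar{y}_{..})^2}$$

and

$$MS_{error} = \frac{\sum_i \sum_j (y_{ij} - \bar{y}_{..})^2 - a \sum_j (\bar{y}_{.j} - \bar{y}_{..})^2 - b \sum_i (\bar{y}_{i.} - \bar{y}_{..})^2 - MS_{int}}{(a-1)(b-1) - 1}.$$

Under the additivity hypothesis $S_T$ is F-distributed with 1 and
$(a-1)(b-1) - 1$ degrees of freedom.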
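
The Tukey statistic can be computed directly from the row and column means. The following is a minimal sketch in plain Python, assuming the table is given as a list of rows; the function name `tukey_statistic` and the example table are illustrative and are not part of the R package mentioned in the paper.

```python
def tukey_statistic(y):
    """Return (S_T, error degrees of freedom) for an a x b table y (list of rows)."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    # Estimated row and column effects: mean minus overall mean.
    row_eff = [sum(row) / b - grand for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]
    ss_row = sum(e * e for e in row_eff)
    ss_col = sum(e * e for e in col_eff)
    # One-degree-of-freedom sum of squares for the product-type interaction.
    cross = sum(y[i][j] * row_eff[i] * col_eff[j]
                for i in range(a) for j in range(b))
    ms_int = cross ** 2 / (ss_row * ss_col)
    # Residual variation left after removing main effects and MS_int.
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    df_error = (a - 1) * (b - 1) - 1
    ms_error = (ss_total - a * ss_col - b * ss_row - ms_int) / df_error
    return ms_int / ms_error, df_error


table = [[1, 2, 3], [2, 4, 7], [3, 7, 10]]  # hypothetical 3 x 3 example
s_t, df = tukey_statistic(table)
print(s_t, df)  # S_T is 45 up to rounding, with 3 error degrees of freedom
```

The value of $S_T$ would then be compared with a quantile of the F-distribution with 1 and `df` degrees of freedom; the quantile itself has to come from statistical tables or a statistics library.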



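Mandel test: Introduced in Mandel (1961). The Mandel test statistic $S_M$
equals

$$S_M = \frac{\sum_i (z_i - 1)^2 \sum_j (\bar{y}_{.j} - \bar{y}_{..})^2}{a - 1} \Bigg/ \frac{\sum_i \sum_j \left((y_{ij} - \bar{y}_{i.}) - z_i (\bar{y}_{.j} - \bar{y}_{..})\right)^2}{(a-1)(b-2)},$$

where

$$z_i := \frac{\sum_j y_{ij} (\bar{y}_{.j} - \bar{y}_{..})}{\sum_j (\bar{y}_{.j} - \bar{y}_{..})^2}.$$

Under the additivity hypothesis $S_M$ is F-distributed with $a - 1$ and
$(a-1)(b-2)$ degrees of freedom.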
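
As an illustration, the Mandel statistic can be evaluated in a few lines. This is a sketch under the assumption that the table is a list of rows; the name `mandel_statistic` is hypothetical, and the second degrees of freedom are taken as $(a-1)(b-2)$, matching the divisor used in the statistic.

```python
def mandel_statistic(y):
    """Return (S_M, (df1, df2)) for an a x b table y given as a list of rows."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_mean = [sum(row) / b for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]
    ss_col = sum(c * c for c in col_eff)
    # z_i: slope of row i regressed on the estimated column effects.
    z = [sum(y[i][j] * col_eff[j] for j in range(b)) / ss_col for i in range(a)]
    # Variation explained by unequal slopes.
    ss_slopes = sum((zi - 1) ** 2 for zi in z) * ss_col
    # Variation left after fitting a separate slope in every row.
    ss_resid = sum(((y[i][j] - row_mean[i]) - z[i] * col_eff[j]) ** 2
                   for i in range(a) for j in range(b))
    df1, df2 = a - 1, (a - 1) * (b - 2)
    return (ss_slopes / df1) / (ss_resid / df2), (df1, df2)


table = [[1, 2, 3], [2, 4, 7], [3, 7, 10]]  # hypothetical 3 x 3 example
print(mandel_statistic(table))  # S_M is 19 up to rounding, with (2, 2) degrees of freedom
```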



Definitions of the following three tests differ slightly from their original
versions. For fixed $a$ and $b$, simulation may be used to obtain the critical
values.



Johnson-Graybill test: Introduced in Johnson and Graybill (1972). The
Johnson-Graybill test statistic is just $S_J = \omega_1$. The additivity
hypothesis is rejected if $S_J$ is high.



Locally best invariant (LBI) test: See Boik (1993b). The LBI test statistic
equals (up to a monotonic transformation)

$$S_L = \sum_{i=1}^{\min(a,b)-1} \omega_i^2.$$

The additivity hypothesis is rejected if $S_L$ is high.
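
Assuming the LBI statistic is the sum of squared scaled eigenvalues of $RR^T$, it can be obtained without an eigenvalue routine, because the sum of the eigenvalues equals the trace of $RR^T$ and the sum of their squares equals the trace of $(RR^T)^2$. The sketch below uses this identity; the function name `lbi_statistic` is illustrative.

```python
def lbi_statistic(y):
    """Sum of squared scaled eigenvalues of R R^T, computed via traces."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_mean = [sum(row) / b for row in y]
    col_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    # Residual matrix with respect to the main effects.
    r = [[y[i][j] - row_mean[i] - col_mean[j] + grand for j in range(b)]
         for i in range(a)]
    # Gram matrix G = R R^T (a x a, symmetric).
    g = [[sum(r[i][j] * r[k][j] for j in range(b)) for k in range(a)]
         for i in range(a)]
    trace = sum(g[i][i] for i in range(a))                      # sum of eigenvalues
    trace_sq = sum(g[i][k] ** 2 for i in range(a) for k in range(a))  # tr(G^2)
    return trace_sq / trace ** 2


table = [[1, 2, 3], [2, 4, 7], [3, 7, 10]]  # hypothetical 3 x 3 example
print(lbi_statistic(table))  # 0.92 up to rounding
```

Zero eigenvalues of $RR^T$ contribute nothing to either trace, so no rank handling is needed. The Johnson-Graybill and Tusell statistics do require the individual eigenvalues and are not covered by this shortcut.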

Tusell test: See Tusell (1990). The Tusell test statistic equals (up to a
constant)

$$S_U = \prod_{i=1}^{\min(a,b)-1} \omega_i.$$

The additivity hypothesis is rejected if $S_U$ is low.

As will be verified in the next section, the Tukey and Mandel tests are
appropriate if $\gamma_{ij} = k\alpha_i\beta_j$, while the Johnson-Graybill,
LBI and Tusell omnibus tests are suitable for more complex interactions.



3 Simulation Study 



In this section simulation results on the power of the additivity tests are
presented. According to Simeckova and Rasch (2008), the type-I-risk of the
tests mentioned in Section 2 is not affected even when one of the effects in
(1) is considered random. The mixed effects model used for the simulation
study is as in (1), where $\mu$, $\alpha_i$ are constants and the $\beta_j$ are
independent normally distributed random variables with zero mean and variance
$\sigma_\beta^2$.

Two possible interaction schemes were under inspection:

A) $\gamma_{ij} = k\alpha_i\beta_j$, where $k$ is a real constant.

B) $\gamma_{ij} = k\alpha_i\delta_j$, where the $\delta_j$ are independent
normally distributed random variables with zero mean and variance
$\sigma_\delta^2$, independent of $\beta_j$ and $\epsilon_{ij}$, and $k$ is a
real constant.

The $\epsilon_{ij}$ are independent normally distributed random variables with
zero mean and unit variance, $\sigma^2 = 1$.

The other parameters are equal to $\mu = 0$, $\sigma_\beta^2 = 2$,
$\sigma_\delta^2 = 1$, $a = 10$,

$(\alpha_1, \ldots, \alpha_{10}) = (-2.03, -1.92, -1.27, -0.70, 0.46, 0.61, 0.84, 0.94, 1.07, 2.00)$.

Two possibilities are considered for $b$, either $b = 10$ or $b = 50$, and 10
different values between 0 and 12 are considered for the interaction
parameter $k$.

For each combination of parameter values a dataset was generated based on
model (1), the tests of additivity were performed and their results were
recorded. This step was repeated 10 000 times. The estimated power of a test
is the percentage of rejections. All tests were done at the $\alpha = 5\%$
level.
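
The simulation loop described above can be sketched as follows. This is an illustrative outline rather than the code used for the study: the function names are hypothetical, the variance of $\delta_j$ is read here as 1 (an assumption), and `test` stands for any function that returns True when additivity is rejected at the chosen level.

```python
import math
import random

# Row effects used in the simulation study.
ALPHA = (-2.03, -1.92, -1.27, -0.70, 0.46, 0.61, 0.84, 0.94, 1.07, 2.00)


def generate_table(k, b, scheme, rng, var_beta=2.0, var_delta=1.0, var_eps=1.0):
    """Draw one 10 x b table from the mixed model with interaction scheme A or B."""
    beta = [rng.gauss(0.0, math.sqrt(var_beta)) for _ in range(b)]
    if scheme == "A":
        mult = beta  # interaction k * alpha_i * beta_j
    elif scheme == "B":
        # interaction k * alpha_i * delta_j with delta_j independent of beta_j
        mult = [rng.gauss(0.0, math.sqrt(var_delta)) for _ in range(b)]
    else:
        raise ValueError("scheme must be 'A' or 'B'")
    sd_eps = math.sqrt(var_eps)
    return [[ai + beta[j] + k * ai * mult[j] + rng.gauss(0.0, sd_eps)
             for j in range(b)] for ai in ALPHA]


def estimate_power(test, k, b, scheme, n_rep=10_000, seed=0):
    """Share of simulated tables for which test(table) rejects additivity."""
    rng = random.Random(seed)
    rejections = sum(bool(test(generate_table(k, b, scheme, rng)))
                     for _ in range(n_rep))
    return rejections / n_rep
```

With $k = 0$ the same loop estimates the type-I-risk instead of the power.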

The dependence of the power on $k$ is visualized in Figure 1. As we can see,
while the Tukey and Mandel tests outperform the omnibus tests for interaction A
and low $k$ and $b$, they completely fail to detect interaction B even for a
large value of $k$ and $b = 50$. Therefore, it is desirable to develop a test
which is able to detect a spectrum of practically relevant alternatives while
still having power comparable to the Tukey and Mandel tests for the most
common interaction scheme A.

Fig. 1. Power dependence on $k$, $b$ ($b = 10$ left, $b = 50$ right) and
interaction type (A up, B down). Tukey test solid line, Mandel test dashed
line, Johnson-Graybill test dotted line, LBI test dot-dash line, Tusell test
long dash line.



4 Modification of Tukey Test 

In the Tukey test, model (1) in the form

$$y_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ij} = \mu + \alpha_i + \beta_j + k\alpha_i\beta_j + \epsilon_{ij} \tag{3}$$

is tested against the submodel (2), $y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$.
The estimators of row effects $\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$
and column effects $\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$ are calculated
in the same way in both models, although the dependency of $y_{ij}$ on these
parameters is not linear for the full model.

The main idea behind the presented modification is that the full model (3) is
fitted by nonlinear regression and tested against the submodel
$y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$ by a likelihood ratio test.
The estimates of row and column effects therefore differ for each model.



4.1 Non-adjusted test



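Under the additivity hypothesis the maximum likelihood estimators of the
parameters can be calculated simply as $\hat{\mu} = \bar{y}_{..}$,
$\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$ and
$\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$. The residual sum of squares
equals

$$RSS_0 = \sum_i \sum_j (y_{ij} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j)^2 = \sum_i \sum_j (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2.$$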



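In the full model (3) the parameter estimates are computed iteratively. Let us
first take $\hat{\alpha}_i^{(0)} = \hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$,
$\hat{\beta}_j^{(0)} = \hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$ and

$$\hat{k}^{(0)} = \frac{\sum_i \sum_j (y_{ij} - \hat{\alpha}_i^{(0)} - \hat{\beta}_j^{(0)} - \hat{\mu}) \, \hat{\alpha}_i^{(0)} \hat{\beta}_j^{(0)}}{\sum_i \sum_j \left(\hat{\alpha}_i^{(0)}\right)^2 \left(\hat{\beta}_j^{(0)}\right)^2}.$$

The $\hat{k}^{(0)}$ is equal to the estimator of $k$ in the classical Tukey
test.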

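The iteration procedure continues by updating the estimates one by one (while
the rest of the parameters are fixed):

$$\hat{\alpha}_i^{(n)} = \frac{\sum_j (y_{ij} - \hat{\mu} - \hat{\beta}_j^{(n-1)}) (1 + \hat{k}^{(n-1)} \hat{\beta}_j^{(n-1)})}{\sum_j (1 + \hat{k}^{(n-1)} \hat{\beta}_j^{(n-1)})^2},$$

$$\hat{\beta}_j^{(n)} = \frac{\sum_i (y_{ij} - \hat{\mu} - \hat{\alpha}_i^{(n-1)}) (1 + \hat{k}^{(n-1)} \hat{\alpha}_i^{(n-1)})}{\sum_i (1 + \hat{k}^{(n-1)} \hat{\alpha}_i^{(n-1)})^2},$$

$$\hat{k}^{(n)} = \frac{\sum_i \sum_j (y_{ij} - \hat{\alpha}_i^{(n-1)} - \hat{\beta}_j^{(n-1)} - \hat{\mu}) \, \hat{\alpha}_i^{(n-1)} \hat{\beta}_j^{(n-1)}}{\sum_i \sum_j \left(\hat{\alpha}_i^{(n-1)}\right)^2 \left(\hat{\beta}_j^{(n-1)}\right)^2}.$$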


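Surprisingly, it seems that one iteration is enough to converge in the vast
majority of cases. Therefore, for simplicity let us define

$$RSS = \sum_i \sum_j \left(y_{ij} - \hat{\mu} - \hat{\alpha}_i^{(1)} - \hat{\beta}_j^{(1)} - \hat{k}^{(1)} \hat{\alpha}_i^{(1)} \hat{\beta}_j^{(1)}\right)^2.$$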
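
The one-step procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the package implementation: the function name `modified_tukey_rss` is hypothetical, and every update is assumed to use only the estimates from the previous step, so that after a single step the value of $k$ coincides with the classical Tukey estimate.

```python
def modified_tukey_rss(y):
    """Return (RSS_0, RSS, k) after one update step for an a x b table y."""
    a, b = len(y), len(y[0])
    cells = [(i, j) for i in range(a) for j in range(b)]
    mu = sum(sum(row) for row in y) / (a * b)
    # Step 0: estimates under the additive model.
    alpha = [sum(row) / b - mu for row in y]
    beta = [sum(y[i][j] for i in range(a)) / a - mu for j in range(b)]

    def rss(al, be, k):
        return sum((y[i][j] - mu - al[i] - be[j] - k * al[i] * be[j]) ** 2
                   for i, j in cells)

    # Least-squares k for fixed step-0 effects (the classical Tukey estimate).
    k0 = (sum((y[i][j] - mu - alpha[i] - beta[j]) * alpha[i] * beta[j]
              for i, j in cells)
          / sum((alpha[i] * beta[j]) ** 2 for i, j in cells))
    rss0 = rss(alpha, beta, 0.0)
    # One step: each parameter is refitted with the others held at step-0 values.
    alpha1 = [sum((y[i][j] - mu - beta[j]) * (1 + k0 * beta[j]) for j in range(b))
              / sum((1 + k0 * beta[j]) ** 2 for j in range(b)) for i in range(a)]
    beta1 = [sum((y[i][j] - mu - alpha[i]) * (1 + k0 * alpha[i]) for i in range(a))
             / sum((1 + k0 * alpha[i]) ** 2 for i in range(a)) for j in range(b)]
    k1 = k0  # assumption: the k update also uses the step-0 effects
    return rss0, rss(alpha1, beta1, k1), k1


table = [[1, 2, 3], [2, 4, 7], [3, 7, 10]]  # hypothetical 3 x 3 example
rss0, rss1, k = modified_tukey_rss(table)
print(rss0, rss1, k)
```

For a table that is exactly additive, $k$ is estimated as zero and both residual sums of squares vanish, which is a quick sanity check of the update formulas.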

The likelihood ratio statistic of the modified Tukey test, i.e. the difference
of twice the log-likelihoods, equals

$$\frac{RSS_0 - RSS}{\sigma^2}$$

and is asymptotically $\chi^2$-distributed with 1 degree of freedom.

The consistent estimate of the residual variance $\sigma^2$ equals
$s^2 = \frac{RSS}{ab-a-b}$, and $\frac{RSS}{\sigma^2}$ is approximately
$\chi^2$-distributed with $ab - a - b$ degrees of freedom. Thus, using a linear
approximation of the nonlinear model (3),

$$\frac{RSS_0 - RSS}{RSS / (ab-a-b)} \tag{4}$$

is F-distributed with 1 and $ab - a - b$ degrees of freedom. Easy manipulation
of (4) gives the modified Tukey test, which rejects the additivity hypothesis
if and only if

$$RSS_0 > RSS \left(1 + \frac{1}{ab-a-b} F_{1,ab-a-b}(1-\alpha)\right),$$

where $F_{1,ab-a-b}(1-\alpha)$ stands for the $1-\alpha$ quantile of the
F-distribution with 1 and $ab - a - b$ degrees of freedom.
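
For completeness, the manipulation leading to the rejection rule can be written out. The steps below only assume $RSS > 0$ and write $F$ for the $1-\alpha$ quantile of the F-distribution with 1 and $ab - a - b$ degrees of freedom.

```latex
\begin{align*}
\frac{RSS_0 - RSS}{RSS / (ab - a - b)} > F
  &\iff RSS_0 - RSS > \frac{F}{ab - a - b}\, RSS \\
  &\iff RSS_0 > RSS \left( 1 + \frac{F}{ab - a - b} \right).
\end{align*}
```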

Now we return to the simulation study from Section 3. For interaction A the
power of the modified test is almost equal to the power of the Tukey test. For
interaction B the power of the tests is compared in Figure 2; the power of the
modified test is much higher than the power of the Tukey test.

Theoretically, we may expect the modified test to be conservative because just
one iteration does not find precisely the maximum of the likelihood of
model (3). However, as we will see in the following part, the situation for a
small number of rows or columns is quite the opposite.



4.2 Small sample adjustment



If the left part of Figure 2 were magnified enough, it would show that the
modified test does not work properly (type-I-risk = 6%). The reason is that
the likelihood ratio test statistic converges to the $\chi^2$-distribution
rather slowly (see Bartlett), and a correction for small sample size is needed.
We present two possibilities that are recommended if the number of rows or
columns is below 20 (an empirical threshold based on simulations).

Fig. 2. Power dependence on $k$, $b$ ($b = 10$ left, $b = 50$ right) for
interaction type B. Tukey test solid line, Mandel test dashed line,
Johnson-Graybill test dotted line, LBI test dot-dash line, Tusell test long
dash line, modified Tukey test two dash line. The proposed modification
improves the Tukey test and, for large $k$, almost reaches the power of the
omnibus tests.

One possibility to overcome this obstacle is a permutation test, i.e. to
generate data as follows:

$$y_{ij}^{(perm)}(t) = \hat{\mu} + \hat{\alpha}_i^{(0)} + \hat{\beta}_j^{(0)} + r_{\pi_{ij}(t)}, \qquad t = 1, \ldots, N^{(perm)},$$

where $\pi(t)$ is a random permutation of the indexes of the matrix $R$. For
each $t$ the statistic of interest $S^{(perm)}(t) = RSS_0(t) - RSS(t)$ is
computed. The critical value equals the $(1-\alpha) \cdot 100\%$ quantile of
$S^{(perm)}(t)$, $t = 1, \ldots, N^{(perm)}$.
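
The permutation scheme can be sketched as follows. The function name `permutation_critical_value` and its parameters are illustrative; `statistic` is assumed to be any function mapping a table to a number, for instance the difference $RSS_0 - RSS$, and the empirical quantile is taken as a single order statistic, which is one of several common conventions.

```python
import random


def permutation_critical_value(y, statistic, n_perm=1000, level=0.05, seed=0):
    """Empirical (1 - level) quantile of `statistic` over residual permutations."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_eff = [sum(row) / b - grand for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]
    # Additive fit and the residuals that will be shuffled across cells.
    fit = [[grand + row_eff[i] + col_eff[j] for j in range(b)] for i in range(a)]
    resid = [y[i][j] - fit[i][j] for i in range(a) for j in range(b)]
    rng = random.Random(seed)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(resid)  # random permutation of the residual cells
        table = [[fit[i][j] + resid[i * b + j] for j in range(b)]
                 for i in range(a)]
        stats.append(statistic(table))
    stats.sort()
    # Order statistic whose rank is closest to (1 - level) * n_perm.
    rank = max(1, min(n_perm, round((1 - level) * n_perm)))
    return stats[rank - 1]
```

Additivity would then be rejected when the statistic computed from the observed table exceeds the returned critical value.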

The second possibility is to estimate the residual variance
$s^2 = \frac{RSS}{ab-a-b}$ and then generate samples from the distribution

$$y_{ij}^{(sample)}(t) = \hat{\mu} + \hat{\alpha}_i^{(0)} + \hat{\beta}_j^{(0)} + \epsilon_{ij}^{(sample)}(t), \qquad t = 1, \ldots, N^{(sample)},$$

where the $\epsilon_{ij}^{(sample)}(t)$ are i.i.d. generated from a normal
distribution with zero mean and variance $s^2$. This is simply a parametric
bootstrap on residuals.

The proposed statistic of interest is $|\hat{k}|$, mirroring the deviation
from the null hypothesis $k = 0$. As in the permutation test, the additivity
hypothesis is rejected if more than $(1-\alpha) \cdot 100\%$ of the sampled
statistics lie below the statistic based on the real data.
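
A sketch of the parametric bootstrap decision is given below. The names `tukey_k` and `bootstrap_rejects` are hypothetical, the variance estimate `s2` is passed in by the caller, and the example statistic uses the classical Tukey estimate of $k$, which is an assumption about which estimate of $k$ is meant.

```python
import math
import random


def _effects(y):
    """Overall mean, row effects and column effects of a table (list of rows)."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_eff = [sum(row) / b - grand for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]
    return grand, row_eff, col_eff


def tukey_k(y):
    """Classical Tukey estimate of k (assumes non-constant row and column means)."""
    _, row_eff, col_eff = _effects(y)
    num = sum(y[i][j] * row_eff[i] * col_eff[j]
              for i in range(len(y)) for j in range(len(y[0])))
    return num / (sum(r * r for r in row_eff) * sum(c * c for c in col_eff))


def bootstrap_rejects(y, s2, statistic, n_sample=1000, level=0.05, seed=0):
    """True if more than (1 - level) of the sampled statistics lie below the observed one."""
    a, b = len(y), len(y[0])
    grand, row_eff, col_eff = _effects(y)
    rng = random.Random(seed)
    sd = math.sqrt(s2)
    observed = statistic(y)
    below = 0
    for _ in range(n_sample):
        # Additive fit plus fresh normal noise with variance s2.
        table = [[grand + row_eff[i] + col_eff[j] + rng.gauss(0.0, sd)
                  for j in range(b)] for i in range(a)]
        if statistic(table) < observed:
            below += 1
    return below > (1 - level) * n_sample
```

A call such as `bootstrap_rejects(table, s2, lambda t: abs(tukey_k(t)))` would apply the rule with the absolute value of the estimated $k$ as the statistic of interest.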



5 Conclusion 

We have proposed a modification of the Tukey additivity test. The modified
test performs almost as well as the Tukey test when the interaction is a
product of main effects, and it should be recommended if we also require
reasonable power in the case of more general interaction schemes. Problems
with small sample size may be overcome by a permutation test or a parametric
bootstrap on residuals.

All mentioned tests are implemented in the R package AdditivityTests, which
may be downloaded from http://github.com/rakosnicek/additivityTests.

As far as we know, this is the first R implementation of additivity tests
with the exception of the Tukey test.